Dropout as a Bayesian Approximation: Insights and Applications
Authors
Abstract
Deep learning techniques are increasingly widely used, but they lack the ability to reason about uncertainty over their features. Features extracted from a dataset are given as point estimates, which do not capture how confident the model is in its estimation. This is in contrast to probabilistic Bayesian models, which allow reasoning about model confidence, but often at the price of diminished performance. We show that a multilayer perceptron (MLP) with arbitrary depth and non-linearities, with dropout applied before every weight layer, is mathematically equivalent to an approximation to a well known Bayesian model. This interpretation offers an explanation for some of dropout’s key properties, such as its robustness to over-fitting. Our interpretation allows us to reason about uncertainty in deep learning, and allows the introduction of the Bayesian machinery into existing deep learning frameworks in a principled way. Our analysis suggests straightforward generalisations of dropout for future research which should improve on current techniques.
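The practical recipe implied by this interpretation is often called Monte Carlo dropout: keep dropout active at test time, run several stochastic forward passes, and read the predictive mean and variance off the samples. A minimal numpy sketch under assumed toy weights (the two-layer MLP, the weight values, and the dropout rate `p` are all illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-layer MLP weights (random toy values, for illustration only).
W1 = rng.normal(scale=0.5, size=(1, 50)); b1 = np.zeros(50)
W2 = rng.normal(scale=0.5, size=(50, 1)); b2 = np.zeros(1)

def forward(x, p=0.5):
    """One stochastic forward pass with dropout kept ON at test time."""
    h = np.maximum(x @ W1 + b1, 0.0)      # ReLU hidden layer
    mask = rng.random(h.shape) >= p       # fresh Bernoulli dropout mask
    h = h * mask / (1.0 - p)              # inverted-dropout scaling
    return h @ W2 + b2                    # dropout sits before the weight layer W2

x = np.array([[0.3]])
samples = np.stack([forward(x) for _ in range(200)])  # T = 200 MC samples
mean = samples.mean(axis=0)   # predictive mean
var = samples.var(axis=0)     # sample variance ~ model uncertainty estimate
```

The spread of the samples, not just their average, is what the Bayesian reading buys you: a point far from the training data would show a larger `var`.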
Similar references
Uncertainty in Deep Learning
Deep learning has attracted tremendous attention not only from researchers in various fields of information engineering such as AI, computer vision, and language processing [Kalchbrenner and Blunsom, 2013; Krizhevsky et al., 2012; Mnih et al., 2013], but also from more traditional sciences such as physics, biology, and manufacturing [Anjos et al., 2015; Baldi et al., 2014; Bergmann et al., 2014]. Neural...
A Theoretically Grounded Application of Dropout in Recurrent Neural Networks
Recurrent neural networks (RNNs) stand at the forefront of many recent developments in deep learning. Yet a major difficulty with these models is their tendency to overfit, with dropout shown to fail when applied to recurrent layers. Recent results at the intersection of Bayesian modelling and deep learning offer a Bayesian interpretation of common deep learning techniques such as dropout. This...
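The fix this line of work proposes for recurrent layers is to sample one dropout mask per sequence and reuse it at every time step, rather than resampling per step. A numpy sketch of that idea on a vanilla RNN (the sizes, weights, and rate `p` are illustrative assumptions, not taken from the cited paper):

```python
import numpy as np

rng = np.random.default_rng(1)

H, D, T, p = 8, 4, 10, 0.5                 # hidden size, input size, steps, drop rate
Wx = rng.normal(scale=0.1, size=(D, H))    # toy input-to-hidden weights
Wh = rng.normal(scale=0.1, size=(H, H))    # toy hidden-to-hidden weights

def rnn_shared_mask_dropout(xs):
    """Vanilla RNN where ONE dropout mask, sampled once per sequence,
    is applied to the recurrent state at every time step."""
    mask = (rng.random(H) >= p) / (1.0 - p)  # sampled once, reused below
    h = np.zeros(H)
    for x_t in xs:                           # same mask at each step
        h = np.tanh(x_t @ Wx + (h * mask) @ Wh)
    return h

xs = rng.normal(size=(T, D))
h_final = rnn_shared_mask_dropout(xs)
```

Resampling the mask inside the loop instead recovers the naive per-step dropout that the excerpt describes as failing on recurrent layers.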
Dropout as a Bayesian Approximation: Appendix
We show that a neural network with arbitrary depth and non-linearities, with dropout applied before every weight layer, is mathematically equivalent to an approximation to a well known Bayesian model. This interpretation might offer an explanation for some of dropout’s key properties, such as its robustness to overfitting. Our interpretation allows us to reason about uncertainty in deep learning...
Estimation of the Parameters of the Lomax Distribution using the EM Algorithm and Lindley Approximation
Estimation of the parameters of a statistical distribution is one of the important subjects of statistical inference. Due to the applications of the Lomax distribution in business, economics, statistical science, queueing theory, internet traffic modeling and so on, in this paper the parameters of the Lomax distribution under type II censored samples are estimated using maximum likelihood and Bayesian methods. Whereas...
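For a concrete sense of the frequentist side of that estimation problem, here is a minimal profile-likelihood sketch for the Lomax distribution on complete (uncensored) data; this is a simplified illustration, not the EM-algorithm or Lindley-approximation machinery the cited paper develops. Given a candidate scale λ, the MLE of the shape α has the closed form α̂ = n / Σ log(1 + xᵢ/λ), so one can profile over λ on a grid:

```python
import numpy as np

rng = np.random.default_rng(2)
alpha_true, lam_true, n = 3.0, 2.0, 5000

# Simulate Lomax data by inverse-CDF sampling: X = lam * ((1-U)^(-1/alpha) - 1)
u = rng.random(n)
x = lam_true * ((1.0 - u) ** (-1.0 / alpha_true) - 1.0)

def profile_mle(x, lam_grid):
    """For each candidate lambda the MLE of alpha is closed form:
    alpha_hat = n / sum(log(1 + x/lambda)); pick the pair maximising
    the Lomax log-likelihood ll = n*log(a) - n*log(lam) - (a+1)*S."""
    n = len(x)
    best_ll, best_a, best_lam = -np.inf, None, None
    for lam in lam_grid:
        s = np.log1p(x / lam).sum()
        a = n / s                                        # conditional MLE of alpha
        ll = n * np.log(a) - n * np.log(lam) - (a + 1) * s
        if ll > best_ll:
            best_ll, best_a, best_lam = ll, a, lam
    return best_a, best_lam

a_hat, l_hat = profile_mle(x, np.linspace(0.5, 5.0, 200))
```

With 5000 samples the recovered (α̂, λ̂) land close to the generating values (3.0, 2.0); handling type II censoring, as in the paper, would modify the likelihood to account for the unobserved order statistics.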
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
Deep learning has gained tremendous attention in applied machine learning. However, such tools for regression and classification do not capture model uncertainty. Bayesian models offer a mathematically grounded framework to reason about model uncertainty, but usually come with a prohibitive computational cost. We show that dropout in neural networks (NNs) can be cast as a Bayesian approximation....